Motif Discovery in Tissue-Specific Regulatory Sequences Using Directed Information

نویسندگان

  • Arvind Rao
  • Alfred O. Hero
  • David J. States
  • James Douglas Engel
چکیده

Motif discovery for the identification of functional regulatory elements underlying gene expression is a challenging problem. Sequence inspection often leads to discovery of novel motifs (including transcription factor sites) with previously uncharacterized function in gene expression. Coupled with the complexity underlying tissue-specific gene expression, there are several motifs that are putatively responsible for expression in a certain cell type. This has important implications in understanding fundamental biological processes such as development and disease progression. In this work, we present an approach to the identification of motifs (not necessarily transcription factor sites) and examine its application to some questions in current bioinformatics research. These motifs are seen to discriminate tissue-specific gene promoter or regulatory regions from those that are not tissue-specific. There are two main contributions of this work. Firstly, we propose the use of directed information for such classification constrained motif discovery, and then use the selected features with a support vector machine (SVM) classifier to find the tissue specificity of any sequence of interest. Such analysis yields several novel interesting motifs that merit further experimental characterization. Furthermore, this approach leads to a principled framework for the prospective examination of any chosen motif to be discriminatory motif for a group of coexpressed/coregulated genes, thereby integrating sequence and expression perspectives. We hypothesize that the discovery of these motifs would enable the large-scale investigation for the tissue-specific regulatory role of any conserved sequence element identified from genome-wide studies.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Development of an Efficient Hybrid Method for Motif Discovery in DNA Sequences

This work presents a hybrid method for motif discovery in DNA sequences. The proposed method called SPSO-Lk, borrows the concept of Chebyshev polynomials and uses the stochastic local search to improve the performance of the basic PSO algorithm as a motif finder. The Chebyshev polynomial concept encourages us to use a linear combination of previously discovered velocities beyond that proposed b...

متن کامل

Seeder: discriminative seeding DNA motif discovery

MOTIVATION The computational identification of transcription factor binding sites is a major challenge in bioinformatics and an important complement to experimental approaches. RESULTS We describe a novel, exact discriminative seeding DNA motif discovery algorithm designed for fast and reliable prediction of cis-regulatory elements in eukaryotic promoters. The algorithm is tested on biologica...

متن کامل

Molecular and Bioinformatics Analysis of Allelic Diversity in IGFBP2 Gene Promoter in Indigenous Makuee and Lori-Bakhtiari Sheep Breeds

The aim of this study was to perform molecular and bioinformatics analysis of IGFBP2 gene promoter in association with some economic traits in indigenous Makuee (MS) and Lori-Bakhtiari (LB) breeds. DNA was extracted from blood samples of 120 MS and 200 LB and a 297 bp fragment from the upstream sequences of studied gene was amplified and genotyped by single-strand conformational polymo...

متن کامل

Characterization of Binding Sites of Eukaryotic Transcription Factors

To explore the nature of eukaryotic transcription factor (TF) binding sites and determine how they differ from surrounding DNA sequences, we examined four features associated with DNA-binding sites: G+C content, pattern complexity, palindromic structure, and Markov sequence ordering. Our analysis of the regulatory motifs obtained from the TRANSFAC database, using yeast intergenic sequences as b...

متن کامل

Comparative Analysis of Regulatory Motif Discovery Tools for Transcription Factor Binding Sites

In the post-genomic era, identification of specific regulatory motifs or transcription factor binding sites (TFBSs) in non-coding DNA sequences, which is essential to elucidate transcriptional regulatory networks, has emerged as an obstacle that frustrates many researchers. Consequently, numerous motif discovery tools and correlated databases have been applied to solving this problem. However, ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره 2007  شماره 

صفحات  -

تاریخ انتشار 2007